Genomics, Proteomics & Bioinformatics
Oxford University Press (OUP)
Preprints posted in the last 90 days, ranked by how well they match Genomics, Proteomics & Bioinformatics's content profile, based on 171 papers previously published here. The average preprint has a 0.29% match score for this journal, so anything above that is already an above-average fit.
Kambara, K.; Chen, Q.; Tsugama, D.
Grass Expression Atlas (GExA) is an interactive web-based resource for rapid exploration of gene expression across diverse tissues, developmental stages, and conditions in grass species. GExA integrates publicly available RNA sequencing (RNA-seq) datasets for four millets: pearl millet (Cenchrus americanus), foxtail millet (Setaria italica), proso millet (Panicum miliaceum), and finger millet (Eleusine coracana), and includes barley (Hordeum vulgare) and sorghum (Sorghum bicolor) as reference species. Datasets were processed using a unified processing workflow to generate expression values in transcripts per million (TPM). The current release comprises 4,673 samples from 442 BioProjects, including 987 pearl millet samples and 2,216 foxtail millet samples, and is provided through a user-friendly web interface. GExA is designed for scalable expansion to additional species via the pipeline used in this study. GExA is freely available at https://webpark2116.sakura.ne.jp/RNADB.
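For readers unfamiliar with the TPM unit mentioned in the abstract above, the standard conversion from raw counts can be sketched as follows. This is a minimal illustration of the general TPM definition, not the GExA pipeline itself; the function name `tpm` and the toy inputs are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts for one sample into transcripts per million.

    counts     -- reads assigned to each transcript (hypothetical input)
    lengths_bp -- transcript lengths in base pairs, same order as counts
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")

    # Step 1: normalize each count by transcript length in kilobases.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]

    # Step 2: rescale so that the values in the sample sum to one million.
    total = sum(rpk)
    if total == 0:
        return [0.0] * len(counts)
    return [r / total * 1e6 for r in rpk]


if __name__ == "__main__":
    # Three transcripts whose counts scale with their lengths get equal TPM.
    print(tpm([10, 20, 30], [1000, 2000, 3000]))
```

Because every sample sums to the same total, TPM values can be compared across samples more directly than raw counts, which is presumably why a single unit is used across all datasets in the atlas.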
Catalan, P. R.; Mu, W.; Liu, J.
Polyploidization plays a fundamental role in plant evolution and crop domestication. However, due to the high similarity of genomic sequences between some homologous or homeologous chromosomes, the assembly of some polyploid genomes is extremely difficult, frequently resulting in erroneous assemblies, such as sequence chimeras and sequence collapse. The genus Brachypodium is an important model system for the study of polyploidy in grasses. However, high-quality reference genomes are still lacking for its complex polyploid perennial species. In this study, we developed a bioinformatic pipeline for the accurate assembly of high-quality reference genomes at the chromosomal level for two representative perennial Brachypodium species with conflicting collapsed segments, the allotetraploid B. phoenicoides (2n = 4x = 28) and the autohexaploid B. boissieri (2n = 6x = 48). We developed an innovative methodology (CollapsedChrom) that uses depth-of-read profiling and relies on prior karyotypic information to systematically detect and rescue collapsed regions. This depth-sensitive curation strategy successfully recovered 328.9 Mb and 195.8 Mb of previously collapsed sequences in the genomes of B. phoenicoides and B. boissieri, respectively. Comprehensive quality assessments demonstrated the high quality of our final assemblies. Our chromosomal-level assemblies fully capture the genomic architectures of these species. These robust genomic resources overcome long-standing challenges in polyploid assembly and provide an essential foundation for future research on the evolutionary dynamics, subgenomic interactions, and functional biology of complex polyploid plant genomes.
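The idea of using read depth to spot collapsed regions, as described in the abstract above, can be illustrated with a small sketch. This is not the CollapsedChrom implementation, whose details are not given here; it assumes per-window mean depths are already available and uses a hypothetical threshold (`min_ratio`) against the median depth.

```python
from statistics import median


def flag_collapsed(window_depths, min_ratio=1.5):
    """Flag windows whose read depth suggests collapsed copies.

    window_depths -- mean read depth per fixed-size window (assumed input)
    min_ratio     -- hypothetical cutoff: depth / median at or above this is flagged
    Returns a list of (window_index, estimated_copies) tuples.
    """
    baseline = median(window_depths)
    if baseline <= 0:
        return []

    flagged = []
    for index, depth in enumerate(window_depths):
        ratio = depth / baseline
        if ratio >= min_ratio:
            # Depth near 2x the baseline suggests two copies merged into one.
            flagged.append((index, round(ratio)))
    return flagged


if __name__ == "__main__":
    depths = [30, 31, 29, 61, 60, 30, 92, 30]
    print(flag_collapsed(depths))
```

In a real polyploid assembly the expected copy number per region would also be constrained by the karyotype, which is the prior information the abstract says the method relies on.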
Dai, B.; Liang, Y.; Yi, L.; Hu, P.; Song, Y.; Qian, B.; He, M.; Wang, L.; Yuan, Z.; Zuo, Y.
Deciphering ultra-large-scale omics data with minimal resources while maintaining high computational efficiency is a longstanding challenge in biology. Here, we present Local Pooling (LP), a lightweight, ultrafast, and general framework that leverages a neighbor-indexing strategy and a local pooling module to generate omics embeddings compatible with a variety of downstream analyses. We developed its adaptation for spatial omics, SpaLP, and evaluated it on over 20 large-scale datasets spanning 9 technology platforms. SpaLP consistently outperformed baseline methods across multiple benchmarks, including niche identification, expression reconstruction, multi-slice integration, 3D organ construction, multi-omics integration, and cross-platform generalization. Notably, SpaLP processed a 1.35-million-cell mouse embryo slice in just 47 seconds, achieving up to a 300-fold increase in computational efficiency compared to Graph Neural Network (GNN)-based methods. Meanwhile, SpaLP increased the average adjusted Rand index (ARI) by over 30% for niche identification in simulated and realistic settings. Furthermore, we applied SpaLP to integrate 8.4 million mouse brain cells within 4 minutes on a single GPU and constructed a 3D spatial atlas. Finally, we explored SpaLP's ability to generalize across platforms and its potential for developing an omics foundation model. We believe that LP, as a novel and general framework, could help more researchers develop new models on large-scale data and overcome the research barriers caused by limited computing resources in more fields.
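Since the abstract above reports its clustering gains as ARI, a short sketch of how that metric is computed may help. This is a plain-Python version of the standard pair-counting ARI formula, written for illustration and not taken from the SpaLP code; the function name and toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Compute the ARI between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")

    n = len(labels_true)
    # Pairs of items placed together in both labelings (contingency cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs placed together within each labeling on its own.
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    total_pairs = comb(n, 2)
    expected = sum_true * sum_pred / total_pairs if total_pairs else 0.0
    maximum = (sum_true + sum_pred) / 2
    if maximum == expected:
        # Degenerate case (for example, one cluster in both labelings).
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


if __name__ == "__main__":
    # Same grouping with swapped label names still scores a perfect match.
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

The score is invariant to how clusters are named and is corrected for chance agreement, so a value near zero means the predicted niches are no better than a random assignment.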
Pan, B.-Z.; Zhang, X.; Hu, X.-D.; Fu, Q.; Chen, M.-S.; Tao, Y.-B.; Niu, L.-J.; He, H.; Shen, Y.; Cheng, Z.; Lang, T.; Liu, C.; Xu, Z.-F.
Sacha inchi (Plukenetia volubilis L.) is an emerging woody oilseed crop prized for its high alpha-linolenic acid (ALA) content. Despite its nutritional and economic value, the lack of high-quality genomic resources has hindered genetic improvement and the elucidation of its unique polyunsaturated fatty acid and lipid biosynthetic pathways. In this study, we report a high-quality, chromosome-scale genome assembly of sacha inchi with a total length of 710.62 Mb, integrated from Illumina, PacBio, and chromosome conformation capture (Hi-C) technology. The genome harbors 37,570 protein-coding genes and 379.86 Mb (53.45%) of repetitive sequences. Phylogenomic analysis reveals that sacha inchi diverged from its closest relative, Ricinus communis, approximately 36.2 million years ago. Comparative genomics indicates that sacha inchi experienced only ancient whole genome duplication events. To elucidate the mechanisms governing ALA biosynthesis and triacylglycerol (TAG) accumulation in sacha inchi seeds, we performed temporal transcriptome profiling across six seed development stages. Our findings demonstrate that high TAG content is primarily driven by the sustained expression of biosynthetic genes and low activity of degradation genes during mid-to-late seed development. Notably, while genes encoding stearoyl-ACP desaturases (SADs) maintain the precursor pool, the expression of genes encoding fatty-acid desaturase 2 (FAD2) and fatty-acid desaturase 3 (FAD3) is positively correlated with the final accumulation of C18:2 and C18:3 fatty acids. We also identified lncRNAs as potential epigenetic regulators of these key pathways. This high-quality genome provides a critical foundation for elucidating the molecular mechanisms of seed growth and development in sacha inchi.
Pazicky, S.; Dziekan, J. M.; Tjia, S.; Bopp, S.; Wirth, D.; Bozdech, Z.
Spreading resistance to clinically used antimalarial drugs increases the need for identification of new drug targets. Here, we screened 25 antimalarials with known and unknown modes of action to validate old and find new drug targets in P. falciparum. Combining an experimental approach (proteome-wide cellular thermal shift assay) with computational filtering by molecular docking, we found that the drugs bind to previously known drug targets, validated ACS10 as the target of MMV665915, and discovered new drug targets. Furthermore, we updated the experimental and analytical MS-CETSA pipeline to include a membrane solubilization step, validated the targets of atovaquone and cipargamin in cell lysates and in intact cells, and mapped the parasite response to these antimalarials, identifying the monocarboxylate transporter MCP2 as an atovaquone transporter.
Wang, B.; Wan, S.; Zhang, P.; Zhang, Y.; Wang, X.; Dong, L.; Ye, K.; Yang, X.
The complete assembly of the human Y chromosome remains a challenge due to its highly repetitive and complex structure. While complete telomere-to-telomere (T2T) assemblies have been generated for a few individuals, such high-quality resources for East Asian populations, particularly for well-characterized multi-omics reference cohorts, are still scarce. The Chinese Quartet, comprising monozygotic twin daughters and their parents, is a premier reference material for genomic studies, yet a T2T-level Y chromosome assembly for this pedigree was lacking. Here, we present a complete, gapless T2T assembly of the Y chromosome (designated CQ-chrY) from the father of the Chinese Quartet. This assembly was generated by integrating Oxford Nanopore ultra-long reads, PacBio HiFi reads, and Hi-C data, resulting in a sequence of 61.88 Mb. The assembly shows exceptional base accuracy (QV = 51.09) and structural completeness (GCI = 100; CRAQ AQI = 95.217). We completely resolved the 33.52 Mb Yq12 heterochromatic region and annotated 164 protein-coding genes and 51.03 Mb (82.47%) of repetitive sequences. This CQ-chrY assembly represents the third complete Chinese Y chromosome and fills the last gap in the T2T assemblies of the Quartet family, providing an invaluable paternal haplotype resource for expanding East Asian genomic standards and for studies on Y chromosome structural variation and evolution.
Qin, X.; Wen, B.; He, P.; Chen, Z.; Tan, S.; Mao, Z.
Osteoporosis affects millions of women globally. In this study, we applied bioinformatics methods to screen for novel diagnostic biomarkers of osteoporosis in women using the GSE62402 and GSE56814 datasets. PCSK5, ZNF225, and H1FX were used to construct a diagnostic model. ROC, calibration, and decision curve analyses were performed to assess the diagnostic performance on the training (GSE56814) and external (GSE56815) datasets. The expression level of model genes was validated in GEO datasets. Furthermore, five transcription factors (ETS1, NOTCH1, MAZ, ERG, and FLI1) were identified as common upstream regulators of model genes. PCSK5, ZNF225, and H1FX serve as novel diagnostic biomarkers, providing new insights into the pathogenesis of and treatment strategies for osteoporosis in women.
Wang, L.; Qu, R.; Huang, Q.; Hu, M.; Chen, T.
Tumor heterogeneity highlights the necessity of precision cancer medicine, making the evaluation and screening of anticancer drugs a core challenge in cancer therapy. However, current cell-based efficacy assessment methods struggle to quantify the holistic impact of drugs on cellular behavior through specific target engagement. Here, we proposed a novel approach (DL-TCP-FRET) that integrates phenotypic and target-related evaluations: the logistic fitting analysis is performed on time- and concentration-dependent cellular phenotypic characteristics to construct a phenotypic score (P), while a target score (T) is established based on the FRET efficiency between target proteins. These two scores were then further combined to generate a unified drug efficacy score (PT). Validation in A549 cells demonstrated that our method can reliably distinguish EGFR-TKIs from non-targeted drugs. DL-TCP-FRET simplifies the experimental workflow of drug efficacy evaluation and improves the accuracy of targeted drug identification, providing a novel strategy for advancing precision cancer therapy.
Li, S.; Chou, E.; Wang, K.; Boyle, A. P.; Sartor, M. A.
Mapping the genomic locations and patterns of transcription factor binding sites (TFBS) is essential for understanding gene regulation and advancing treatments for diseases driven by DNA modifications, including epigenetic changes and sequence variants. Although several TFBS databases exist, no study has systematically benchmarked these databases across different sequencing technologies and computational algorithms. In this study, we addressed this gap by constructing a TFBS database that integrates all available ENCODE cell line ATAC-seq and Cistrome Data Browser ChIP-seq datasets, comprising 11.3 million human and 1.87 million mouse TFBS. We also integrated previously published TFBS resources (Factorbook, Unibind, RegulomeDB, and ENCODE_footprint) and found each contains a substantial fraction of unique TFBS predictions, highlighting significant discrepancies among existing resources. To assess the accuracy of the combined TFBS regions, we assembled ten independent genomic annotation datasets for evaluation and found that TFBS regions predicted by multiple databases are more likely to represent true and biologically meaningful binding sites. For each predicted TFBS region, we define two scores: the confidence score reflects prediction reliability, while the importance score represents biological functional relevance. Finally, we introduce TFBSpedia, a lightweight and efficient search engine that enables rapid retrieval of TFBS regions and comprehensive annotation information across the integrated databases.
Yao, F.; He, J.; Nyaruaba, R.; Chen, F.; Zhou, J.; Yang, H.; Wei, H.; Li, Y.
Microorganisms significantly influence human health, and dysbiosis of the oral microbiome plays a critical role in the development and progression of both oral and systemic diseases. This highlights the urgent need for novel therapeutics targeting specific pathogens. Here, we presented a structure-based pipeline to efficiently identify potential phage-derived periodontal lysins (LysPds) from nearly one million proteins. We predicted the structures of candidate lysins using AlphaFold2 and developed an innovative structure-based similarity network to classify them into distinct clusters, each with unique functional properties. A systematic characterization of 16 representative LysPds from 11 superfamilies revealed that over 90% demonstrated potent antibacterial activity against key periodontal pathogens. Among these, LysPd078 was identified as a promising preclinical drug candidate, effectively reconfiguring microbiome communities while demonstrating significant efficacy and safety in mouse models of periodontitis and calvarial infection. Our findings highlight the effectiveness of structure-based similarity networks in exploring vast protein spaces and underscore the potential of LysPd078 as a targeted modulating agent for the oral microbiome.
Mai, G.; Dai, Y.
This study introduces a one-stop analysis platform named "PathoResistAI" (https://www.resistpath.com/), which can be used to solve the technical bottlenecks of pathogenic microorganism detection and antimicrobial resistance analysis. The platform is based on nanopore sequencing and the innovative all-ratio algorithm, which integrates four-dimensional parameters (sequence similarity, abundance, matching number, and matching length), significantly improving the detection accuracy of low-abundance pathogens and drug-resistance genes. The platform adopts four layers of modular design (input layer, core engine, dual-channel output, and visualization layer). Users only need to upload data in FASTQ format, and they can obtain automated reports, including pathogen identification and drug-resistance gene prediction within 30 min. The verification results show that the platform can accurately identify bacteria (e.g., Staphylococcus aureus and Serratia marcescens), viruses (e.g., Ebola virus), and drug-resistance genes (e.g., SdeY), which are consistent with the published literature results. Limitations include only supporting long-read sequencing data, small sample size (fewer than 50 cases), and lack of real clinical sample verification. In general, this platform represents the application and exploration of nanopore sequencing in the field of rapid detection of pathogenic microorganisms, and provides a new tool for microbial pathogen or AMR detection research.
Gao, Q.; Song, Y.; Yang, Y.; Wang, S.; Ruan, X.; Liu, Z.; Guo, D.; Chen, Y.; Wang, X.; Chen, R.; Xu, H.; Lin, F.
In agriculture, propiconazole (PCZ) controls excessive growth in flowering Chinese cabbage but poses dietary safety risks due to residue accumulation. Therefore, identifying novel PCZ targets and breeding PCZ-free cultivars is critical for the safe production of flowering Chinese cabbage. Here, we identified three P4-ATPase flippase homologs aminophospholipid ATPase 3 (BraALA3a/b/c) in flowering Chinese cabbage that function as sensitive targets for PCZ. These proteins exhibit high binding affinity for PCZ, which directly inhibits their ATPase activity. Overexpression of the BraALA3 homologs enhanced plant growth and increased sensitivity to PCZ, whereas knockdown led to dwarfism and reduced sensitivity. Based on these findings, we identified editable active sites via protoplast-based screening. Genetic transformation of one such site yielded BraALA3a/braala3aK200T mutant lines, which displayed a dwarf and compact architecture. These findings provide a precise molecular target for developing PCZ-free germplasm in flowering Chinese cabbage through gene editing.
Lin, Y.; Guo, Q.; Xu, X.; Gu, H.; Hu, M.; Wu, Y.; Wu, Y.; Meng, L.; Ye, G.
Increasing attention is being focused on the glycemic index (GI) of daily food for humans, and the resistant starch content (RSC) is an important indicator of GI for starch-rich staple foods. In recent years, some studies revealed that the loss of function of single or multiple key enzymes in the primary pathway of starch synthesis, such as starch branching enzyme IIb (BEIIb) and soluble starch synthase IIIa (OsSSIIIa), substantially increases RSC in rice. However, a noteworthy negative characteristic of these high-RSC mutants is the substantially increased amylose content (AC). AC, as a major determinant of rice eating quality, must not be higher than an acceptable limit for most consumers. To solve this problem, in this study, we adopted two promoter editing (PE) strategies to develop rice germplasms with a better balance of RSC and AC: one is to edit the promoter of BEIIb in a low-AC rice variety, and the other is to edit the promoter of the Waxy (Wx) gene in a BEIIb loss-of-function mutant. Using AC ≤ 20%, which is the range of premium quality rice in China, as a criterion, we finally obtained 2 homozygous lines with significantly increased RSC (≥ 5%) in the NG46 background by promoter editing of BEIIb and 1 homozygous line in the YouTang2 (YT2, a BEIIb mutant) background by promoter editing of the Wx gene. Further analysis revealed that AC and the amount of long-chain branches of amylopectin are positively correlated with RSC in the population of BEIIb PE lines. However, unexpectedly, the Wx PE line with substantially decreased AC (17.7%) also showed significantly increased RSC (16.9%). Our study not only produces useful germplasms for future high-RSC rice breeding but also provides insight into the relationship between AC and RSC in defective BEIIb rice.
Zhou, D.; Fiches, G. N.; Wu, Z.; Eleya, S.; Park, Y.; He, J.; Shanaka, K. A.; Lepcha, T. T.; Liu, Y.; Oliva, J.; Lurain, K.; Jung, J. U.; Qi, J.; Zhao, W.; Zhu, J.; Santoso, N. G.
Histone methylation is a dynamic and reversible epigenetic modification that critically controls the progression of human diseases, including infections and cancers. Here we reported that histone lysine demethylases (KDMs) in the KDM5 family, KDM5A/B, play profound roles in suppressing lytic reactivation of oncogenic human herpesvirus 8 (HHV-8), i.e., Kaposi's sarcoma-associated herpesvirus (KSHV), as well as antiviral/antitumor innate immune responses in KSHV-infected B-cell lymphomas. We showed that KSHV lytic replication decreases KDM5A/B protein stability by enhancing their K48-linked polyubiquitination, while KDM5A/B depletion facilitates KSHV lytic reactivation. Mechanistic studies illustrated that KDM5A/B associate with KSHV LANA protein and dampen its chromatin association at both the KSHV viral lytic promoter and promoters of antitumor immune-responsive genes (IRGs). Compared with normal B cells, KDM5A/B expression was significantly increased in B-cell lymphoma cells, including KSHV-positive primary effusion lymphoma (PEL). We demonstrated that KDM5A/B inhibition remarkably induces both KSHV lytic reactivation and innate immune responses in PEL cells, resulting in a strong viral oncolytic effect, both in vitro in cell cultures and in vivo using a PEL xenograft mouse model. Overall, our studies identified the novel functions of KDM5A/B to silence KSHV lytic replication and antiviral/antitumor innate immune responses, which can be blocked to benefit the treatment of KSHV-associated B-cell lymphomas that are usually aggressive and difficult to treat.
Wen, K.; Zha, J.; Chen, S.; Zhong, J.; Yuan, L.; Cui, Y.; Shi, X.; Qin, W.; Lan, X.; Liu, Y.; Yang, X.; Qin, H.; Li, M.; Guo, P.; Xiao, Q.; Wu, T.; Zhou, Y.; Cao, C.; Ning, S.; Wu, C.; Gao, Q.; He, H.; Ma, Y.; An, Z.; Liu, X.; Chen, Y.; Zheng, Z.; Wei, H.; Ma, Y.; Zhang, J.
Coherent Ising machines (CIMs) excel at solving large-scale combinatorial optimization problems (COPs), but their insufficient long-term stability has hindered their application in compute-intensive tasks like computer-aided drug discovery (CADD). By improving the fiber vibration isolation and temperature control systems, we have implemented a 2000-node CIM named QBoson-CPQC-3Gen that achieves stable solutions over one hour on large-scale COPs. Graph-based encoding schemes were further introduced to realize a CIM-based CADD workflow including allosteric site detection, protein-peptide docking, and intermolecular similarity calculation. CIM-based methods demonstrated greater speed and accuracy than heuristic algorithms. In particular, QBoson-CPQC-3Gen identified 2 novel druggable sites and bioactive compounds for 6 targets, which were further validated in vitro, in-cell, and by crystal structures. Our contributions established a quantum-computing framework for multi-stage drug discovery, representing a significant advancement in both quantum computing applications and pharmaceutical research.
Choi, S.; Lee, N.; Jeon, H.; Park, J.; Kim, S.; Kim, J.-E.; Shin, J.; Moon, H.; Min, K.; Choi, Y.; Hwangbo, A.; Kim, H.; Choi, G. J.; Lee, Y.-W.; Song, D.-G.; Son, H.
WD40 is a highly conserved protein domain in eukaryotes, playing a critical role in various cellular processes. We conducted genome-wide functional analysis of WD40 genes in Fusarium graminearum, a phytopathogenic fungus that causes severe yield loss and mycotoxin contamination in major cereal crops. Comprehensive phenome analysis of 119 WD40 gene deletion mutants across 22 distinct phenotypic traits revealed phenotypic divergence within the phenome, establishing a strong correlation between virulence and sexual reproduction. Notably, 21 "core WD40 genes" were identified, offering valuable insights into divergent biological processes. Pilot interactome studies of Fgwd101 and Fgwd133 provided further insights into their potential pathobiological functions. Our investigation contributes to broadening our knowledge of the biological mechanisms underlying fungal pathogenesis and may assist in the identification of targets for antifungal agents.
Simmons, J. R.; Xue, T.; McCord, R. P.; Wang, J.
Programmed DNA elimination (PDE) is a notable exception to genome integrity, characterized by significant DNA loss during development. In many nematodes, PDE is initiated by DNA double-strand breaks (DSBs), which lead to chromosome fragmentation and subsequent DNA loss. However, the mechanism of nematode programmed DNA breakage remains largely unclear. Interestingly, in the human and pig parasitic nematode Ascaris, no conserved motif or sequence structures are present at chromosomal breakage regions (CBRs), suggesting the recognition of CBRs may be sequence-independent. Using Hi-C, we revealed that Ascaris CBRs engage in three-dimensional (3D) interactions before PDE, indicating that physical contacts between break regions may contribute to the PDE process. The 3D interactions are established in both Ascaris male and female germlines, demonstrating inherent genome organization associated with the CBRs and to-be-eliminated sequences. In contrast, in the unichromosomal horse parasite Parascaris univalens, transient pairwise interactions between neighboring CBRs that will form the ends of future somatic chromosomes were observed only during PDE. Intriguingly, we found that Ascaris PDE, which converts 24 germline chromosomes into 36 somatic ones, induces specific compartmentalization changes. Remarkably, Parascaris PDE generates the same set of 36 somatic chromosomes, and the 3D compartment changes following PDE are consistent between the two species. Overall, our findings suggest that CBRs spatially demarcate the retained and eliminated DNA and may contribute to their spatial organization during Ascaris PDE. We also demonstrated that the 3D genome reorganization of the somatic chromosomes in these nematodes following PDE is evolutionarily and developmentally conserved.
Kawasaki, R.; Takemoto, K.; Hamano, M.
Direct reprogramming (DR) converts somatic cells directly into target cell types while bypassing an intermediate pluripotent state, such as induced pluripotent stem cells. In practice, DR is achieved by transfecting multiple transcription factors (TFs); prior research has shown that combining microRNAs (miRNAs) with TFs further improves reprogramming efficiency. However, experimentally identifying effective TFs and miRNA combinations is difficult and costly, underscoring the need for robust in silico prediction approaches. We developed a graph neural network-based method to predict TFs that induce DR across diverse human cell types while explicitly modeling miRNA-mediated transcriptional regulation. By constructing a gene regulatory network integrating TF-target gene, TF-miRNA, miRNA-target gene, and gene-gene interactions, we implemented a Graph Attention Network v2 that predicts DR-inducing TFs while learning interaction importance and capturing transcriptional activation and repression. This approach outperformed existing methods in predicting experimentally validated DR-inducing TFs. Moreover, high-ranking predictions for previously unexplored tissues included TFs known to be associated with the development of the corresponding tissues, supporting the biological relevance of the results. Overall, the proposed method provides a practical in regenerative medicine.
Zhang, C.; Li, J.; Luo, O.; Andrews, T.; Steinberg, G. R.; WANG, D.
The liver acts as a central metabolic hub, integrating systemic signals through a spatially organized pattern known as zonation, driven by the coordinated activity of diverse cell types including hepatocytes, stellate cells, Kupffer cells, endothelial cells, and immune populations. Spatial transcriptomics (ST) enables the profiling of thousands of cells with spatial resolution in a single experiment, facilitating the identification of novel gene markers, cell types, cellular states, and tissue neighborhoods across diverse tissues and organisms. By simultaneously capturing transcriptional and spatial heterogeneity, ST has become a powerful tool for understanding cellular and tissue biology. Given its advantages, there is growing demand for applying ST to uncover novel biological insights in the liver under various physiological and pathological conditions including obesity, diabetes, and metabolic dysfunction-associated steatotic liver disease (MASLD). However, no comprehensive and practical protocols currently exist for analyzing ST data specifically in the context of liver metabolism. Herein, we present a systematic and detailed protocol for ST data analysis using liver tissues from MASLD mouse models. This guide offers practical support for metabolism-focused researchers without advanced expertise in coding, mathematics, and statistics, enabling single-cell RNA-seq referencing for deconvolution-based annotation, curated liver cell type markers for manual annotation, and a GMT file of metabolic gene sets and flux balance analysis to analyze liver metabolic activity. This framework and integrated computational resources for decoding metabolic reprogramming and cellular heterogeneity will empower researchers to uncover novel biological pathways regulating liver metabolism in health and disease.
Asawa, R.; Hazzard, B.; Tebben, K.; Tan, J.; Cantaert, T.; Berry, A. A.; Tolia, N. H.; Popovici, J.; Serre, D.
Plasmodium vivax is the second most prevalent Plasmodium species, with 2.5 billion people at risk of infection worldwide and around 10 million cases of clinical vivax malaria every year. Despite the clinical importance of this pathogen, very little is known about the P. vivax proteins recognized by the host immune system, which hinders our ability to select vaccine candidates or develop efficient serological markers. To comprehensively characterize immunogenic P. vivax proteins, we designed a high-density peptide array containing 4.2 million peptides covering the entire protein sequence of all P. vivax genes and analyzed antibody responses of infected and malaria-naive individuals. We identified a total of 283 proteins that are commonly immunogenic in symptomatic individuals. These proteins included most proteins known to be involved in erythrocyte invasion, a putative new invasion protein, several nucleoporins, and many uncharacterized proteins that should be further investigated for their roles during blood-stage infections. These analyses also revealed a unique pattern of antibody response against PIR proteins in asymptomatic individuals, that could be associated with protection against clinical vivax malaria. Overall, these data provide an agnostic and comprehensive perspective on immunogenic P. vivax proteins and constitute an important resource for the malaria community to develop new tools for better detecting and eliminating this important human pathogen.